Abstract 

The threshold model is a simple but classic model of contagion spreading in complex 
social systems. To capture the complex nature of social influencing we investigate 
numerically and analytically the transition in the behavior of threshold-limited cascades 
in the presence of multiple initiators as the distribution of thresholds is varied between 
the two extreme cases of identical thresholds and a uniform distribution. We accomplish 
this by employing a truncated normal distribution of the nodes’ thresholds and observe 
a non-monotonic change in the cascade size as we vary the standard deviation. Further, 
for a sufficiently large spread in the threshold distribution, the tipping-point behavior of 
the social influencing process disappears and is replaced by a smooth crossover governed 
by the size of initiator set. We demonstrate that for a given size of the initiator set, there 
is a specific variance of the threshold distribution for which an opinion spreads optimally. 
Furthermore, in the case of synthetic graphs we show that the spread asymptotically 
becomes independent of the system size, and that global cascades can arise just by the 
addition of a single node to the initiator set. 


Introduction 

The technological breakthroughs of the 21st century have strongly contributed to the 
emergence of network science, a multidisciplinary science with applications in many 
scientific fields and technologies. Several sociological opinion diffusion models first 
introduced in the middle of the 20th century are now being thoroughly studied, while 
variations of these classical models have been introduced. Most of these models are based 
on social reinforcement, where simple rules based on the interaction of individuals with 
their respective nearest neighbors govern individual opinion evolution. The macroscopic 
outcome of these rules is a cascade of nodes switching opinions [1-9]. We focus our study 
on one of the classic models of social influencing, the Threshold Model (TM). The TM 
is a binary opinion spread model first introduced by Granovetter [2] to model collective 
behavior socially driven by peer pressure. Under the TM a node adopts a new opinion 
only when the fraction of its nearest neighbors possessing that opinion is larger than 
an assigned threshold, which represents the resistance of the node to peer pressure [3]. 
Although the microscopic rule of opinion adoption in the TM is simple, the collective 
behavior that arises is complex and non-linear. The resulting spread size depends on a 
large set of parameters, such as the network structure (e.g., clustering) [7, 10-13], the 
size of the set of initially active nodes (initiators), the selection strategy of the initiators, and 
the distribution of threshold values among nodes of the network. The first thorough 
investigation of the TM was made by Watts [5], who examined the effect of one randomly 
selected initiator on the cascade size. Gleeson and Cahalane [14-16], on the other hand, 
determined analytically the cascade size for varying initiator sizes (or fractions) for 
the infinite system size. Recent investigations of the TM by Karimi and Holme [17] 
and Michalski et al. [18] also considered the impact of temporal networks on contagion 
cascades. Very recently, Ruan et al. [19] studied the effects of “immune” individuals 
(those who resist adopting the new idea indefinitely) and external influencing (e.g., by 
mass media or advertisements) in the TM. 

An important problem in generalized models for social and biological contagion [20-22] 
is to optimize the set of initiators, i.e., for a fixed cost (seed size), find the set of initiators 
giving rise to the largest cascade, or alternatively, find the minimum size seed set required 
to activate the entire network [23]. As far as selection strategies are concerned, Kempe 
et al. [24] showed that the optimization problem of selecting the most influential nodes 
in any directed weighted graph with uniform random selection of thresholds is NP-hard. 
They also suggested a greedy algorithm [24], where each new initiator is selected based 
on the maximum spread it can cause, which unfortunately resulted in low efficiency of 
the algorithm. Chen et al. [25] designed a scalable algorithm (LDAG) which is based 
on the properties of directed acyclic graphs. Recently, Lim et al. [26] introduced a 
new node-level measure of influence, called cascade centrality (based on the size of the 
cascade resulting from the node being the only initiator), which may guide the selection 
of multiple initiators. Closely related to these studies and of practical interest is to find 
a set of initiators (not necessarily the smallest) in a scalable fashion that guarantees 
that the entire network will ultimately turn active, triggered by these initiators [27]. 
Their method was inspired by the $k$-shell decomposition of the network [28], which itself 
can be an effective heuristic for selecting initiators in a broad class of models for the 
spreading of social or biological contagion [21]. 
Singh et al. [10] studied the effect in the TM of varying the fraction of initiators 
on the cascade size for various basic heuristic selection strategies when each node has 
an identical threshold in the network. They showed that there is a critical fraction of 
initiators (“tipping point”) at which a sharp (discontinuous) phase transition occurs 
from small to large cascades in Erdos-Renyi (ER) graphs [29]. This phase transition is 
apparent for the random, $k$-shell, and degree-ranked selection strategies, which are listed 
in the increasing order of their performance. These findings, in particular, the emergence 
of the discontinuous transition, were analogous to those found by Baxter et al. [30] for 
bootstrap percolation (there, activation of a node requires $k$ active neighbors). 

Watts [5] proposed the first analytic solution for the TM, using percolation theory 
and generating functions to measure the size of the largest cluster of nodes requiring 
only one active neighbor to turn active (largest vulnerable cluster). The model applies 
to unweighted, undirected graphs with small clustering coefficient. In the infinite system 
size limit, when the vulnerable cluster percolates, there is a non-zero probability that a cascade 
will take over a large portion of the network (global cascade). A randomly selected 
initiator will activate the largest vulnerable cluster if it is a part of the cluster or is one 
of its neighbors. Using this analytic method, Watts studied the regime for which global 
cascades are possible for one initiator, for different values of identical thresholds $\phi_0$ and 
average degree $z$ of synthetic graphs. He found that, for ER graphs with $O(1)$ initiators, 
the criterion for global cascades is $z < 1/\phi_0$. 

Gleeson and Cahalane [14] formulated an analytic approach for the TM with varying 
initiator sizes. Their work was inspired by the zero-temperature Random-Field Ising 
Model (RFIM) [31, 32], where the cascade size, the initiator size and the threshold 
distribution correspond to the magnetization, the external uniform field and the local 
quenched random fields of the RFIM, respectively. The main difference between the two models is that 
in the TM the activated nodes remain activated, while in the RFIM the spins may flip 
back to an inactive state. The analytic approach to the TM is applicable to locally 
tree-like structures [14], such as ER graphs. The graph is considered an infinite-level tree 
with a level-by-level updating of the spread size, starting from the bottom of the tree. 

In most of the past research, the cascade size has been thoroughly investigated for an 
identical threshold in the network [5, 10-13], or for a random threshold for each node [24, 
25]. However, a model with identical thresholds does not capture the complex nature 
of social influencing when multiple initiators are present. The small-scale experiment 
conducted by Latane [4] and more recently an online experiment by Centola [33] and a 
large online study on Facebook data [34] suggest that individuals have diverse thresholds 
for adopting a newly introduced opinion. Here, to capture the diversity of opinion 
adoption thresholds in a social influence context, we study the effect of heterogeneous 
thresholds on the cascade size under the TM for empirical and synthetic unweighted 
and undirected networks for randomly selected initiators. 


Materials and Methods 


Simulations of the Threshold Model 

We assume that the thresholds are drawn from a truncated normal distribution with 
mean $\phi_0$ and standard deviation $\sigma$. The threshold $\phi$ of each node is limited to the interval 
$[0, 1]$, thus the mean threshold $\phi_0$ is also within this interval, and $\sigma$ is in the range 
$[0, 0.288]$, the boundaries of which correspond to the identical threshold and to the uniformly 
random threshold, respectively ($1/\sqrt{12} \approx 0.288$ is the standard deviation of the uniform 
distribution on $[0, 1]$). Unlike the formulation of the threshold model in [14, 15], 
where the thresholds drawn can be negative, allowing nodes to get spontaneously activated 
as innovators and, as a result, randomizing the set of initiators, we are interested in the 
case where the spread is initiated only by the insertion of randomly selected initiators in 
the network. 
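
As a rough illustration of the threshold assignment described above, the following Python sketch draws thresholds by rejection sampling. It makes two assumptions that the text does not settle: that truncation is implemented by redrawing values falling outside $[0, 1]$, and that `sigma` is the parameter of the underlying normal distribution (so the standard deviation of the truncated sample is somewhat smaller). The function name `sample_thresholds` is hypothetical.

```python
import random
import statistics


def sample_thresholds(n, phi0, sigma, rng):
    """Draw n thresholds from a normal(phi0, sigma) truncated to [0, 1].

    Assumption: truncation by rejection (redraw until inside [0, 1]);
    sigma is the parameter of the underlying, untruncated normal.
    """
    if sigma == 0.0:
        return [phi0] * n  # identical-threshold limit
    thresholds = []
    while len(thresholds) < n:
        phi = rng.gauss(phi0, sigma)
        if 0.0 <= phi <= 1.0:
            thresholds.append(phi)
    return thresholds


rng = random.Random(42)
identical = sample_thresholds(5, 0.5, 0.0, rng)
narrow = sample_thresholds(10_000, 0.5, 0.05, rng)
wide = sample_thresholds(10_000, 0.5, 0.25, rng)
# Empirical standard deviations of the truncated samples
print(statistics.pstdev(narrow), statistics.pstdev(wide))
```

For a wide underlying normal, the truncated sample approaches the uniform distribution on $[0, 1]$, whose standard deviation is the upper bound quoted above.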

Once a threshold for each node is set, for the simulations, we randomly assign initiators 
one by one and measure the cascade size. We repeat this process by drawing thresholds 
from the same distribution. The final cascade size for each threshold distribution is 
obtained by averaging over one thousand different draws of the thresholds and, 
for the synthetic graphs, different network realizations. 
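
The simulation procedure can be sketched as follows. This is a minimal illustration rather than the code used for the reported results: the helper names `er_graph` and `cascade_size` are hypothetical, and the activation rule is taken to be the strict inequality stated in the Introduction (a node turns active when the fraction of its active neighbors is larger than its threshold). Because activations are irreversible, processing newly active nodes one at a time reaches the same final state as synchronous updating.

```python
import random


def er_graph(n, z, rng):
    """Adjacency lists of a G(N, p) graph with p = z / (N - 1)."""
    p = z / (n - 1)
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].append(j)
                adj[j].append(i)
    return adj


def cascade_size(adj, thresholds, initiators):
    """Final active fraction of the threshold model for a given initiator set."""
    n = len(adj)
    active = [False] * n
    active_nbrs = [0] * n  # number of active neighbors seen by each inactive node
    frontier = []
    for i in initiators:
        if not active[i]:
            active[i] = True
            frontier.append(i)
    while frontier:
        node = frontier.pop()
        for nb in adj[node]:
            if active[nb]:
                continue
            active_nbrs[nb] += 1
            # Strict inequality: active fraction must exceed the threshold
            if active_nbrs[nb] / len(adj[nb]) > thresholds[nb]:
                active[nb] = True
                frontier.append(nb)
    return sum(active) / n


rng = random.Random(1)
n, z, p = 1000, 10, 0.2
adj = er_graph(n, z, rng)
thresholds = [0.5] * n  # identical thresholds; any list of per-node values works
initiators = rng.sample(range(n), int(p * n))
print(cascade_size(adj, thresholds, initiators))
```

Averaging `cascade_size` over many threshold draws and network realizations would give the averaged cascade size discussed in the text.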


Network Structures 

The networks we use are undirected and unweighted. The synthetic networks used 
are Erdos-Renyi (ER) graphs and scale-free (SF) networks. For the generation of ER 
graphs [29] we used the $G(N, p_{ER})$ model, with $N$ being the system size and $p_{ER}$ the 
probability that a random node will be connected to any node in the graph. The 
probability $p_{ER}$ is given by $p_{ER} = z/(N - 1)$, where $z$ is the nominal average degree in 
the network. We keep the average degree $z = 10$. For the generation of uncorrelated 
SF networks [35, 36] ($N = 10^4$, $z = 10$, with power-law exponent $\gamma = 3$) we employ the 
configuration model [36, 37] with a structural cut-off, and a maximum possible node 
degree set to $\sqrt{N}$, using a high accuracy look-up table from [38]. 

The empirical networks used are a connected ego-network from a Facebook (FB) 
dataset, available from the Stanford Network Analysis Project (SNAP) [39] (system size 
$N = 4048$, average degree $z = 43$), and a high-school (HS) friendship network [40]. For 
the HS network, we only used the giant connected component of that network, with 
$N = 921$ and $z = 5.96$. The network contains two communities which are roughly equal in 
size (for more information on the two empirical networks see the table in SI Text). Although 
the SF, FB, and HS networks are connected networks, the generated ER graphs may have a 
disconnected component with probability $e^{-z}$, which for $z = 10$ is approximately $0.000045$. 


Tree-like approximation for the Threshold Model 


For analytic methods, we apply Gleeson’s and Cahalane’s tree-like approximation for 
synthetic networks [14, 15]. The approximation is given by the following set of equations 

$$S_{eq} = p + (1 - p) \sum_{k=1}^{\infty} P_k \sum_{m=1}^{k} \binom{k}{m} q_\infty^m (1 - q_\infty)^{k-m} F\left(\frac{m}{k}\right), \qquad (1)$$

$$q_{n+1} = p + (1 - p) \sum_{k=1}^{\infty} \frac{k}{z} P_k \sum_{m=1}^{k-1} \binom{k-1}{m} q_n^m (1 - q_n)^{k-1-m} F\left(\frac{m}{k}\right). \qquad (2)$$

In this approximation the graph is considered an infinite-level tree. The spread diffuses 
level-by-level starting from the bottom of the tree. $q_n$ is defined “as the conditional 
probability that a node on level $n$ is active, conditioned on its parent on level $n + 1$ 
being inactive” and it is given by Eq. (2); $q_\infty$ is its limit as $n \to \infty$. The final spread $S_{eq}$ is given by Eq. (1), and 
is measured at the top of the tree. The fraction of initially active nodes is given by $p$. 
At the bottom of the tree, at level $n = 0$, the fraction of active nodes is only based on 
the initiators, thus $q_0 = p$. The graph degree distribution is given by $P_k$, which for an 
infinite size ER graph is given by $P_k = z^k e^{-z}/k!$, where $z$ is the average degree, while 
for SF networks it is given by $P_k \sim k^{-\gamma}$. $F(m/k)$ is the cumulative probability that a 
node requires $m$ or fewer active neighbors to become active, which depends on the assigned 
threshold distribution. 
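
To make the level-by-level iteration concrete, here is a hedged Python sketch of the tree-like approximation for an ER (Poisson) degree distribution. Several choices are assumptions made for illustration only: the degree sum is cut off at a hypothetical `kmax`, the threshold CDF is that of a normal distribution truncated to $[0, 1]$ and evaluated through `math.erf`, and for identical thresholds the strict activation rule is used. None of the function names come from the original work.

```python
import math


def norm_cdf(t):
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def threshold_cdf(x, phi0, sigma):
    """F(x) for a normal(phi0, sigma) truncated to [0, 1]."""
    if sigma == 0.0:
        return 1.0 if x > phi0 else 0.0  # identical thresholds, strict rule
    lo = norm_cdf((0.0 - phi0) / sigma)
    hi = norm_cdf((1.0 - phi0) / sigma)
    x = min(max(x, 0.0), 1.0)
    return (norm_cdf((x - phi0) / sigma) - lo) / (hi - lo)


def poisson_pmf(z, kmax):
    """P_k = z^k e^{-z} / k! for k = 0..kmax."""
    pk = [math.exp(-z)]
    for k in range(1, kmax + 1):
        pk.append(pk[-1] * z / k)
    return pk


def binom_sum(trials, degree, q, cdf):
    """sum_{m=1}^{trials} C(trials, m) q^m (1-q)^(trials-m) F(m / degree)."""
    return sum(
        math.comb(trials, m) * q**m * (1.0 - q) ** (trials - m) * cdf(m / degree)
        for m in range(1, trials + 1)
    )


def tree_like_cascade(p, z, cdf, kmax=40, tol=1e-12, max_iter=2000):
    """Iterate the recursion for q_n from q_0 = p, then evaluate S_eq."""
    pk = poisson_pmf(z, kmax)
    q = p
    for _ in range(max_iter):
        q_new = p + (1.0 - p) * sum(
            (k / z) * pk[k] * binom_sum(k - 1, k, q, cdf) for k in range(1, kmax + 1)
        )
        converged = abs(q_new - q) < tol
        q = q_new
        if converged:
            break
    return p + (1.0 - p) * sum(
        pk[k] * binom_sum(k, k, q, cdf) for k in range(1, kmax + 1)
    )


s_eq = tree_like_cascade(0.2, 10.0, lambda x: threshold_cdf(x, 0.5, 0.2))
print(s_eq)
```

Passing a different `cdf` (for example `lambda x: x` for uniform thresholds) changes only the threshold distribution; the iteration itself is unchanged.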


Results 

First, we examine the effect of the standard deviation $\sigma$ on the cascade size $S_{eq}$ (averaged) 
for a constant initiator fraction and constant mean threshold $\phi_0$ (Fig. 1). As $\sigma$ increases, 
so does the fraction of nodes whose threshold is far from the average, causing a twofold 
effect. Of the nodes far from the average, the ones with thresholds below average are easily 
activated while those with thresholds above average are increasingly difficult to activate. 
Thus, when the initiator fraction is small, the cascade size $S_{eq}$ is monotonically increasing, 
since the presence of a larger fraction of low-threshold nodes facilitates the spread. However, 
when the initiator fraction is large, the increase in low-threshold nodes helps only a little, 
since they are likely to be activated even without the increase in $\sigma$, but the presence 
of additional high-threshold nodes arrests the spread. This trade-off gives rise to the 
non-monotonic behavior seen in Fig. 1, which is apparent for different types of networks. 
Depending on the network structure and the size of the initiator set, the standard deviation 
$\sigma$ for which the spread is optimal varies. A visualization (Fig. 2) shows time steps of 
the spread on a random selection of initiators with $p = 0.20$ in the FB network. For 
the same set of initiators, the spread for large $\sigma$ ($\sigma = 0.20$) is much higher than for 
identical thresholds ($\sigma = 0.00$). Interestingly, in the vicinity of $\sigma \approx 0$ the sharp decrease 
in the cascade size $S_{eq}$ occurs because with non-zero $\sigma$, approximately half of the nodes 
acquire a threshold higher than $\phi_0 = 0.50$. For all the nodes with threshold $\phi > \phi_0$ and 
with even degree, even the slightest non-zero $\sigma$ value will increase the required number of active 
neighbors by one, thus making cascades less likely to occur. Finally, for ER graphs 
[Fig. 1(a)] and SF networks [Fig. 1(b)] the analytic estimates are in good agreement 
with the simulations. 

In Fig. 3 the cascade size $S_{eq}$ is plotted for varying initiator fractions $p$ for the same 
networks as in Fig. 1. As the initiator fraction increases, for small enough $\sigma$ there is 
a transition from small local cascades to large global cascades, which, for synthetic 
graphs, is a discontinuous phase transition [Fig. 3(a) and (b)]. However, the line of the 
average cascade size $S_{eq}$ appears smooth even in the presence of a discontinuous phase 
transition, because for each repetition the point of the discontinuous phase transition 
varies slightly. With increasing $\sigma$ the initiator fraction for which the transition occurs 
is reduced, while for the synthetic graphs the spread size still exhibits a discontinuous 
phase transition. With largely diverse thresholds we find that a critical initiator size 
beyond which cascades become global ceases to exist and the tipping-point behavior of 
the social influencing process disappears and is replaced by a smooth crossover governed 
by the size of the initiator set. This property can be important, for example, for a company’s 
marketing strategy for a new product. If the threshold distribution is narrow enough, 
unless a critical initiator fraction is reached, there is only a marginal local spread to a few of 
the first or second neighbors of the initiators. On the other hand, if the threshold 
distribution is wide, there is a significant spread. For the uniform random threshold 
distribution each addition of initiators has a reduced contribution to the cascade size, as 
predicted by the submodularity property of the TM [24]. 


In Figs. 4 and 5 we show that the behavior of the cascade size is largely independent 
of the system size $N$ for any threshold distribution with the same degree distribution, 
for ER graphs and SF networks, respectively. We observe that with increasing system 
size $N$ the cascade size $S_{eq}$ converges asymptotically. 

We record the critical initiator fraction $p_c$ for which a discontinuous phase transition 
occurs for varying mean threshold $\phi_0$ (Fig. 6). For the measurement of $p_c$, we first 
calculated the derivative of $S_{eq}$ from Fig. 3 with respect to the initiator fraction 
$p$. The position of the maximum of the derivative yields $p_c$; in other words, 
$p_c = \arg\max_p \left( dS_{eq}(p)/dp \right)$. We used the same method for the calculation of the respective 
analytic estimates. We confine the threshold distribution to $\sigma \le 0.15$ to assess 
whether there is a discontinuous phase transition with increasing initiators. Above each $p_c$ 
line global cascades occur. The value of $p_c$ decreases with increasing $\sigma$. For identical 
thresholds $\phi_0$ (in blue), the $p_c$ line has some sharp jumps, for example at $\phi_0$ equal to 
0.50, 0.33, and 0.25 (Fig. 6). These jumps are artifacts of the discrete steps of the 
degree distribution in the presence of a unique threshold for all the nodes. In particular, 
microscopically, the number of active neighbors required for a node to turn active 
increases by integer values. For example, for a node with degree 10 and $0.40 < \phi < 0.50$, 
that number is 5. For identical thresholds in the network, the cumulative effect of these 
integer steps gives rise to the jumps exhibited by the $p_c(\phi_0)$ curves (Fig. 6). Interestingly, 
this effect also shows in Fig. 1, where for large enough initiator fractions (i.e., $p = 0.25$ 
or higher) the cascade size drops abruptly as $\sigma$ is increased from zero to small values. 
For nodes with mean threshold $\phi_0 = 0.50$, even the smallest non-zero increase in the 
standard deviation $\sigma$ results in approximately half of the nodes having a threshold larger 
than $\phi_0 = 0.50$. The $p_c$ lines are lower for the ER graph compared to the SF networks 
because of the effect that a randomly selected very high degree node in SF networks 
can have on the spread. Our results obtained from simulations are in agreement with 
the analytic estimates. 
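
The arg-max rule for $p_c$ can be illustrated with a short Python sketch that takes a sampled curve $S_{eq}(p)$ and returns the location of its steepest finite-difference slope. The function name `critical_fraction` and the choice of reporting the interval midpoint are assumptions; the curve used below is a synthetic, purely illustrative sigmoid and is not data from this study.

```python
import math


def critical_fraction(p_values, s_values):
    """Midpoint of the grid interval with the largest slope dS/dp."""
    best_slope, best_p = float("-inf"), None
    for i in range(len(p_values) - 1):
        slope = (s_values[i + 1] - s_values[i]) / (p_values[i + 1] - p_values[i])
        if slope > best_slope:
            best_slope = slope
            best_p = 0.5 * (p_values[i] + p_values[i + 1])
    return best_p


# Synthetic curve with a sharp rise near p = 0.2 (illustrative only)
p_grid = [i * 0.01 for i in range(51)]
s_curve = [p + (1.0 - p) / (1.0 + math.exp(-(p - 0.2) / 0.01)) for p in p_grid]
p_c = critical_fraction(p_grid, s_curve)
print(p_c)
```

The resolution of the estimate is limited by the spacing of the $p$ grid, so a finer grid near the transition gives a sharper value.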

To further understand the effect of the standard deviation $\sigma$, we study the dynamics 
of the spread for synchronous updating of the nodes. In phase space, as shown in Fig. 7, 
the difference $S(n + 1) - S(n)$ defines the number of nodes activated from time step 
$n$ to $n + 1$. The dynamic spread in the TM is deterministic and evolves in one direction, 
hence, the spread stops when the change in the cascade size ($Y$-axis) reaches zero. 
Accordingly, the value of the cascade size in the steady state is indicated on the $X$-axis. 
When cascades are not possible, the spread rate decreases monotonically. However, 
when cascades are possible, then up to some $\sigma$ the change is non-monotonic and 
the fraction of nodes in cascades reaches almost one. But as $\sigma$ grows larger and larger, 
this fraction stops growing and stays farther from one. When $\sigma$ approaches 
the standard deviation of the uniform distribution, the lines decrease linearly. 
Interestingly, similar behavior is observed for the FB and HS networks as well. 


Closed-form analytic estimate for the uniform threshold distribution 

For a uniform threshold distribution the phase-space line decreases linearly for any 
initiator fraction for synthetic graphs and almost linearly for the empirical networks 
(Fig. 8). In addition, we show for this threshold distribution, using Gleeson’s and 
Cahalane’s analytical methods, that the phase-space line has a closed form and is linearly 
decreasing. The extended proof of this is shown in SI Text. For a uniform threshold 
distribution the iterative formula in Eq. (2) of the analytic approximation yields the 
following closed-form solution 

$$q_{n+1} = p + b q_n, \qquad (3)$$

with $b = (1 - p)(z - 1 + P_0)/z$. The solution of the above iterative equation with the 
initial condition $q_0 = p$ is 

$$q_n = p \, \frac{1 - b^{n+1}}{1 - b}. \qquad (4)$$

According to [16], the spread at level $n + 1$ is given by 

$$S_{n+1} = h(q_n) = p + (1 - p) \sum_{k=1}^{\infty} P_k \sum_{m=1}^{k} \binom{k}{m} q_n^m (1 - q_n)^{k-m} F\left(\frac{m}{k}\right), \qquad (5)$$

which, in the case of a uniform distribution of thresholds (SI Text), simplifies to 

$$S_{n+1} = p + c q_n, \qquad (6)$$

with $c = (1 - p)(1 - P_0)$, where the initial spread is $S_0 = p$. Using Eq. (6) and 
Eq. (S11) we can calculate (SI Text) the formula for the phase-space diagram 

$$S_{n+1} - S_n = cp + (1 - b)p - (1 - b)S_n. \qquad (7)$$

Eq. (7) is the closed-form phase-space line of Fig. 8. On the other hand, at 
equilibrium (as $n \to \infty$) the spread size in Eq. (6) becomes 

$$S_{eq} = p + c q_\infty, \qquad (8)$$

with $q_\infty = p/(1 - b)$ (SI Text). Note that in this approximation for the uniform threshold 
distribution, the size of the final cascade for uncorrelated networks does not depend 
on the details of the degree distribution; it only depends on the average degree $z$. 
In addition, it is easy to show that the derivative of the final cascade size [Eq. (8)] 
with respect to the initiator size $p$ is monotonically decreasing, in agreement with the 
submodularity property of the TM for the uniform threshold distribution [24]. 
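
The step from the general recursion to the linear closed form can be sketched in a few lines. The derivation below is a reader's sketch, not a reproduction of the SI Text proof; it assumes thresholds uniform on $[0, 1]$, so that $F(m/k) = m/k$, and it writes out the inner sums of the tree-like recursion explicitly so that each step can be checked.

```latex
% Assumption: uniform thresholds on [0,1], hence F(m/k) = m/k.
% Inner sum of the recursion for q_n: mean of a Binomial(k-1, q_n), divided by k.
\sum_{m=1}^{k-1} \binom{k-1}{m} q_n^{m} (1-q_n)^{k-1-m} \frac{m}{k} = \frac{(k-1)\, q_n}{k}

% Substitute and use \sum_{k \ge 1} (k-1) P_k = z - (1 - P_0):
q_{n+1} = p + (1-p) \sum_{k=1}^{\infty} \frac{k}{z} P_k \frac{(k-1)\, q_n}{k}
        = p + \frac{(1-p)(z - 1 + P_0)}{z}\, q_n
        = p + b\, q_n

% Inner sum of the spread equation: mean of a Binomial(k, q_n), divided by k.
\sum_{m=1}^{k} \binom{k}{m} q_n^{m} (1-q_n)^{k-m} \frac{m}{k} = q_n

S_{n+1} = p + (1-p) \sum_{k=1}^{\infty} P_k\, q_n = p + (1-p)(1-P_0)\, q_n = p + c\, q_n

% Fixed point (|b| < 1) and phase-space line:
q_\infty = \frac{p}{1-b}, \qquad S_{eq} = p + \frac{c\, p}{1-b}

S_{n+1} - S_n = c\,(q_n - q_{n-1}) = c\,p - c\,(1-b)\, q_{n-1}
              = c\,p + (1-b)\,p - (1-b)\, S_n
```

Setting the left-hand side of the last line to zero recovers the same $S_{eq}$ as the fixed-point expression, which is a quick consistency check on the signs.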


Discontinuous phase transitions in the threshold model 

To further understand the final cascade size behavior at the critical point for synthetic 
graphs, we examine the system size dependence. The spread size at equilibrium 
is independent of the method of insertion of the initiators, e.g., it does not matter 
whether the addition occurs in fractions or by individual addition of initiators. Using 
Monte Carlo simulations, Singh et al. [10] showed that the average cascade size is largely 
independent of the system size for the same initiator fraction for an identical threshold 
for ER graphs with a unique degree distribution. We use the same approach to show that 
this is true for other threshold distributions for ER graphs (Fig. 4) and SF networks 
(Fig. 5). These results indicate that given an initiator fraction $p_0$ and an average cascade 
size $S_{eq}(p_0)$, the addition of another initiator fraction $p_1$ will cause the same change 
$\Delta S = S_{eq}(p_0 + p_1) - S_{eq}(p_0)$ in the average cascade size $S_{eq}$, largely independently of the 
system size, for large system sizes, for the same input degree and threshold distributions. 

Our analysis so far focused on the cascade size at the steady state $S_{eq}$ averaged over 
many realizations of networks, threshold values and assignment of initiators (Figs. 4 
and 5). To verify the presence and nature of phase transitions, we follow the approach 
presented in [30]. We start by measuring the increase of the cascade size of each sample 
in response to the one-by-one addition of initiators. If a discontinuous phase transition 
arises, at the critical point, the increase of the cascade size should remain constant and 
independent of the system size. To investigate this, let $\nu$ be the current size of the initiator 
set. For a given sample $i$, let $\Delta S_i = S_i(\nu + 1) - S_i(\nu)$ denote the increase in the cascade 
size caused by the addition of a single randomly selected initiator to the current initiator 
set. Let $(\Delta S_i)_{max}(N)$ be the maximum value of $\Delta S_i(N)$ for all initiator set sizes $\nu$. 
Then, varying $\sigma$, we study how $(\Delta S_i)_{max}(N)$ averaged over one thousand repetitions 
depends on the system size $N$ (Fig. 9, solid lines). We observe that for the plotted cases 
with $\sigma = 0.00$ and $\sigma = 0.24$, $\langle (\Delta S)_{max} \rangle (N)$ is independent of the system size. Moreover, 
the contribution of the rest of the initiators to the cascade tends to zero in the limit of 
infinite system size. However, for $\sigma = 0.26$, $\langle (\Delta S)_{max} \rangle (N)$ decreases with the system 
size, indicating the absence of a discontinuous phase transition in the infinite system-size 
limit. Thus, there appears to be a qualitative change somewhere between $\sigma = 0.24$ and $\sigma = 0.26$. 
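
The per-sample measurement of the largest jump can be sketched as below. This is an illustrative reading of the procedure, with hypothetical names; it assumes that a candidate initiator that has already been activated by the cascade is skipped, and that the activation rule is the strict inequality used throughout. The ring network in the usage lines is only a toy input.

```python
import random


def max_single_initiator_jump(adj, thresholds, order):
    """Add initiators one by one and return the largest increase in the
    active fraction caused by a single addition."""
    n = len(adj)
    active = [False] * n
    active_nbrs = [0] * n
    n_active = 0
    max_jump = 0.0
    for seed in order:
        if active[seed]:
            continue  # assumption: already-active nodes are not re-seeded
        before = n_active
        active[seed] = True
        n_active += 1
        frontier = [seed]
        while frontier:
            node = frontier.pop()
            for nb in adj[node]:
                if active[nb]:
                    continue
                active_nbrs[nb] += 1
                if active_nbrs[nb] / len(adj[nb]) > thresholds[nb]:
                    active[nb] = True
                    n_active += 1
                    frontier.append(nb)
        max_jump = max(max_jump, (n_active - before) / n)
    return max_jump


# Toy usage on a ring of 200 nodes with identical thresholds
rng = random.Random(7)
n = 200
ring = [[(i - 1) % n, (i + 1) % n] for i in range(n)]
order = list(range(n))
rng.shuffle(order)
print(max_single_initiator_jump(ring, [0.5] * n, order))
```

Averaging this quantity over many samples and repeating for several system sizes would give the kind of size dependence discussed in the text.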

A similar analysis can be applied to the analytical estimation, with the tree-like 
approximation, of the increase in the cascade size $(\Delta S_{TL})_{max}(\delta p)$ with a marginal 
addition of initiators. However, since the analytical estimation is set for an infinite 
system size, the one-by-one addition of initiators on larger and larger system sizes is not 
possible. Hence, we insert smaller and smaller fractions of initiators $\delta p$. In Fig. 9 the 
top $X$-axis is the fractional step increase of the number of initiators. For consistency, 
we include the corresponding increase in the cascade size $\langle (\Delta S_i)_{max} \rangle (\delta p)$ caused by $\delta p$, a 
fractional step increase of the number of initiators, measured through simulations. In 
this case, the minimum possible fraction of initiators is $\delta p = 1/N$. We observe that the 
results for the one-by-one addition of initiators with varying system sizes through simulations 
agree with those for the fractional increase in an infinite system with varying $\delta p$. 
We conclude that it is between $\sigma = 0.24$ and $\sigma = 0.26$ (for $\phi_0 = 0.50$) that the discontinuous 
phase transitions cease to emerge in the thermodynamic limit. 


Discussion 

Past experimental online studies [4, 33, 34] indicate the existence of diverse adoption 
thresholds of individuals in social networks. Prompted by this observation, we studied the 
impact of diversity of thresholds in spreading a new opinion, by intuitively assuming that 
the adoption thresholds are drawn from a truncated normal distribution. We explored 
this impact by using the threshold model, a reinforcement model which has lately drawn 
significant attention in the scientific community. We showed that in the presence of a 
small spread (standard deviation) of the threshold distribution in a network, unless a 
critical initiator fraction is reached, the impact of the randomly selected initiators is 
small. Furthermore, we showed that, when discontinuous transitions in cascade size are 
possible for synthetic graphs, the addition of a single randomly-selected initiator can 
have a significant (global) impact on the final cascade size, i.e., the manifestation of 
the tipping point. However, with a sufficiently large spread in the individual thresholds 
(with the same mean), the cascade size exhibits a smooth transition, where the impact 
of each added initiator is reduced by the current size of the initiator set. Finally, we 
showed that in the case of a uniform threshold distribution, the spreading rate is linearly 
decreasing with the spread size for synthetic graphs and close to linearly decreasing for 
empirical graphs. In summary, our results indicate that information on the diversity of 
the thresholds is critically important for the understanding of the behavior of cascades in 
threshold-limited social contagion with multiple initiators. Most importantly, sufficiently 
large spread in the individual thresholds can change not only the quantitative aspects of 
triggering global cascades, but also the qualitative behavior of the system: the cascade 
size exhibits a smooth change (as opposed to a discontinuous jump) as a function of the 
fraction of initiators. 


Supporting Information 

SI text (pdf file) Includes basic statistics of the two empirical networks used and the 
closed-form analytical estimate for the case of the uniform distribution of thresholds. 


Acknowledgments 


The authors are grateful to Ferenc Molnar Jr. for his assistance on the generation of 
scale-free networks with the desired and accurate cutoffs and average degree [38]. Add 
Health was designed by J. Richard Udry, Peter S. Bearman, and Kathleen Mullan Harris, 
and funded by a grant P01-HD31921 from the National Institute of Child Health and 
Human Development, with cooperative funding from 17 other agencies. For data files 
contact Add Health, Carolina Population Center, 123 W. Franklin Street, Chapel Hill, 
NC 27516-2524, addhealth@unc.edu. 


Author Contributions 

Conceived and designed the experiments: PDK SS BKS GK. Performed the experiments: 
PDK. Analyzed the data: PDK SS BKS GK. Wrote the paper: PDK SS BKS GK. 


References 

1. Schelling TC. Hockey helmets, concealed weapons, and daylight saving: A study 
of binary choices with externalities. J Confl Resolut. 1973; 17(3): 381-428. doi: 
10.1177/002200277301700302 

2. Granovetter M. Threshold Models of Collective Behavior. Am J Sociol. 1978; 
83(6): 1420-1443. doi: 10.1086/226707 

3. Nowak A, Szamrej J, Latane B. From Private Attitude to Public Opinion: A 
Dynamic Theory of Social Impact. Psychol Rev. 1990; 97(3): 362-376. doi: 
10.1037/0033-295X.97.3.362 


4. Latane B, L’Herrou T. Spatial Clustering in the Conformity Game: Dynamic 
Social Impact in Electronic Groups. J Pers Soc Psychol. 1996; 70(6): 1218-1230. 
doi: 10.1037/0022-3514.70.6.1218 

5. Watts DJ. A simple model of global cascades on random networks. Proc Natl 
Acad Sci USA. 2002; 99(9): 5766-5771. doi: 10.1073/pnas.082090499 

6. Watts DJ, Dodds PS. Influentials, networks, and public opinion formation. J Cons 
Res. 2007; 34(4): 441-458. doi: 10.1086/518527 

7. Centola D, Eguiluz VM, Macy MW. Cascade dynamics of complex propagation. 
Physica A. 2007; 374(1): 449-456. doi: 10.1016/j.physa.2006.06.018 

8. Xie J, Sreenivasan S, Korniss G, Zhang W, Lim C, Szymanski B. Social consensus 
through the influence of committed minorities. Phys Rev E. 2011; 84: 011130. doi: 
10.1103/PhysRevE.84.011130 

9. Ugander J, Backstrom L, Marlow C, Kleinberg J. Structural Diversity in Social 
Contagion. Proc Natl Acad Sci USA. 2012; 109(16): 5962-5966. doi: 
10.1073/pnas.1116502109 

10. Singh P, Sreenivasan S, Szymanski BK, Korniss G. Threshold-limited spreading 
in social networks with multiple initiators. Sci Rep. 2013; 3: 2330. doi: 
10.1038/srep02330 

11. Ikeda Y, Hasegawa T, Nemoto K. Cascade dynamics on clustered network. J Phys: 
Conf Ser. 2010; 221(1): 012005. doi: 10.1088/1742-6596/221/1/012005 

12. Lee KM, Brummitt CD, Goh KI. Threshold cascades with response heterogeneity 
in multiplex networks. Phys Rev E. 2014; 90: 062816. doi: 
10.1103/PhysRevE.90.062816 

13. Nematzadeh A, Ferrara E, Flammini A, Ahn YY. Optimal network modularity 
for information diffusion. Phys Rev Lett. 2014; 113: 088701. doi: 
10.1103/PhysRevLett.113.088701 

14. Gleeson JP, Cahalane DJ. Seed size strongly affects cascades on random networks. 
Phys Rev E. 2007; 75: 056103. doi: 10.1103/PhysRevE.75.056103 

15. Gleeson JP, Cahalane DJ. An analytical approach to cascades on random networks. 
Proc SPIE. 2007; 6601: 66010W. doi: 10.1117/12.724525 

16. Gleeson JP. Cascades on correlated and modular random networks. Phys Rev E. 
2008; 77: 046117. doi: 10.1103/PhysRevE.77.046117 

17. Karimi F, Holme P. Threshold model of cascades in empirical temporal networks. 
Physica A. 2013; 392: 3476-3483. doi: 10.1016/j.physa.2013.03.050 

18. Michalski R, Kajdanowicz T, Brodka P, Kazienko P. Seed Selection for Spread 
of Influence in Social Networks: Temporal vs. Static Approach. New Generat 
Comput. 2014; 32(3-4): 213-235. doi: 10.1007/s00354-014-0402-9 

19. Ruan Z, Iniguez G, Karsai M, Kertesz J. Kinetics of Social Contagion. 
http://arxiv.org/abs/1506.00251 (Accessed: 09/27/2015). 

20. Dodds PS, Watts DJ. A generalized model of social and biological contagion. J 
Theor Biol. 2005; 232(4): 587-604. doi: 10.1016/j.jtbi.2004.09.006 


21. Kitsak M, Gallos LK, Havlin S, Liljeros S, Muchnik L, Stanley HE, et al. Identifi¬ 
cation of influential spreaders in complex networks. Nat Phys. 2010; 6: 888-893. 
doi: 10.1038/nphysl746 

22. Karsai M, Iniguez G, Kaski K, Kertesz J. Complex contagion process in spreading 
of online innovation. J R Soc Interface. 2014; 11(101). doi: 10.1098/rsif.2014.0694 

23. Morone F, Makse HA. Collective influence optimization uncovers the
strength of weak nodes in complex networks. Nature. 2015; 524: 65-68.
doi: 10.1038/nature14604

24. Kempe D, Kleinberg J, Tardos E. Maximizing the spread of influence through
a social network. Proc of the 9th ACM Conf SIGKDD. 2003; 137-146. doi:
10.1145/956750.956769

25. Chen W, Yuan Y, Zhang L. Scalable Influence Maximization in Social Networks
under the Linear Threshold Model. IEEE ICDM. 2010; 88-97. doi:
10.1109/ICDM.2010.118

26. Lim Y, Ozdaglar A, Teytelboym A. A Simple Model of Cascades in Networks
(preliminary draft, 2015).

27. Shakarian P, Eyre S, Paulo D. A scalable heuristic for viral marketing under the
tipping model. Soc Netw Anal Min. 2013; 3: 1225-1248. doi:
10.1007/s13278-013-0135-7

28. Carmi S, Havlin S, Kirkpatrick S, Shavitt Y, Shir E. A model of Internet topology
using k-shell decomposition. Proc Natl Acad Sci USA. 2007; 104(27): 11150-11154.
doi: 10.1073/pnas.0701175104

29. Erdos P, Renyi A. On random graphs. Pub Math Debrecen. 1959; 6: 290-297. 

30. Baxter GJ, Dorogovtsev SN, Goltsev AV, Mendes JFF. Bootstrap percolation
on complex networks. Phys Rev E. 2010; 82: 011103. doi:
10.1103/PhysRevE.82.011103

31. Dhar D, Shukla P, Sethna JP. Zero-temperature hysteresis in the random-field
Ising model on a Bethe lattice. J Phys A. 1997; 30: 5259. doi:
10.1088/0305-4470/30/15/013

32. Sethna JP, Dahmen K, Kartha S, Krumhansl JA, Roberts BW, Shore JD. Hysteresis
and hierarchies: Dynamics of disorder-driven first-order phase transformations.
Phys Rev Lett. 1993; 70(21): 3347-3350. doi: 10.1103/PhysRevLett.70.3347

33. Centola D. The Spread of Behavior in an Online Social Network Experiment.
Science. 2010; 329(5996): 1194-1197. doi: 10.1126/science.1185231

34. State D, Adamic L. The Diffusion of Support in an Online Social Movement:
Evidence from the Adoption of Equal-Sign Profile Pictures. Proc of 18th ACM
Conf CSCW. 2015; 1741-1750. doi: 10.1126/science.1185231

35. Barabasi AL, Albert R. Emergence of scaling in random networks. Science. 1999; 
286: 509-512. doi: 10.1126/science.286.5439.509 

36. Catanzaro M, Boguna M, Pastor-Satorras R. Generation of uncorrelated random
scale-free networks. Phys Rev E. 2005; 71: 027103. doi:
10.1103/PhysRevE.71.027103


37. Britton T, Deijfen M, Martin-Lof A. Generating simple random graphs with
prescribed degree distribution. J Stat Phys. 2006; 124(6): 1377-1397. doi:
10.1007/s10955-006-9168-x

38. Molnar F, Sreenivasan S, Szymanski BK, Korniss G. Minimum dominating sets in
scale-free network ensembles. Sci Rep. 2013; 3: 1736. doi: 10.1038/srep01736

39. Stanford Network Analysis Project (SNAP), http://snap.stanford.edu/data
(Accessed: 04/23/2015).

40. Add Health, http://www.cpc.unc.edu/projects/addhealth (Accessed: 09/27/2015). 


[Figure 1 legend: $p$ = 0.05, 0.10, 0.15, 0.20, 0.25]

Figure 1. Behavior of the cascade size $S_{eq}$ at equilibrium for varying
standard deviation $\sigma$. (a) ER graphs with $z = 10$ and $N = 10^4$; (b) SF networks
with $z = 10$, $\gamma = 3$, and $N = 10^4$; (c) high-school network with $z = 5.96$ and
$N = 921$; (d) facebook network with $z = 43$ and $N = 4039$. The mean threshold is
$\phi_0 = 0.50$. The simulations are averaged over one thousand repetitions. (a) and (b)
also show the analytic estimates (dotted lines) based on the tree-like approximation
(see Materials and Methods).


Figure 2. Visualization of the spread of opinion in the TM on a
facebook network with $z = 43$ and $N = 4039$. The fraction of the randomly selected
initiators is $p = 0.20$. The mean threshold is $\phi_0 = 0.50$ while the standard
deviation of the threshold is (a) $\sigma = 0$, (b) $\sigma = 0.20$. Inactive nodes,
initiators, and active nodes (through spreading) are marked with green, orange, and red,
respectively.

Figure 3. Behavior of the cascade size $S_{eq}$ at equilibrium vs. the initiator
fraction $p$. The networks are the same as in Fig. 1: (a) ER graphs with $z = 10$ and
$N = 10^4$; (b) SF networks with $z = 10$, $\gamma = 3$, and $N = 10^4$; (c) high-school
network with $z = 5.96$ and $N = 921$; (d) facebook network with $z = 43$ and
$N = 4039$. The mean threshold is $\phi_0 = 0.50$. (a) and (b) also show the analytic
estimates (dotted lines) based on the tree-like approximation (see Materials and Methods).

Figure 4. Finite-size behavior of the final cascade size $S_{eq}$ vs. the initiator
fraction $p$ for ER graphs with average degree $z = 10$. The mean threshold is
$\phi_0 = 0.50$ while the standard deviation of the threshold is (a) $\sigma = 0.00$,
(b) $\sigma = 0.20$, (c) $\sigma = 0.26$, and (d) $\sigma = 0.28$.

Figure 5. Finite-size behavior of the final cascade size $S_{eq}$ vs. the initiator
fraction $p$ for SF networks with $z = 10$ and $\gamma = 3$. The mean threshold is
$\phi_0 = 0.50$ while the standard deviation of the threshold is (a) $\sigma = 0.00$,
(b) $\sigma = 0.20$, (c) $\sigma = 0.26$, and (d) $\sigma = 0.28$.

Figure 6. Critical initiator fraction $p_c$ vs. mean threshold $\phi_0$. (a) ER graphs
and (b) SF networks with $\gamma = 3$, with average degree $z = 10$ and system size
$N = 10^4$. An initiator fraction above the $p_c$ line leads to global cascades. The
analytic estimates (dotted lines) are based on the tree-like approximation [14] (see
Materials and Methods).

Figure 7. Phase-space diagrams for a constant initiator fraction $p = 0.15$ and
various standard deviations $\sigma = 0$ (blue), $\sigma = 0.2$ (green), and
$\sigma = 0.288$ (red) for (a) ER graphs and (b) SF networks with $\gamma = 3$, with
$z = 10$ and $N = 10^4$. The colored lines refer to a hundred independent repetitions,
while the black lines are their averages.

Figure 8. Phase-space diagrams for the uniform random threshold distribution
($\sigma = 0.288$), for various initiator fractions $p = 0.05$ (blue), $p = 0.15$ (red),
and $p = 0.25$ (green) for (a) ER graphs, (b) SF networks, (c) high-school network, and
(d) facebook network as in Fig. 1. The solid lines and dotted lines (complete overlap)
correspond to the simulations and to the closed-form analytic estimates [Eq. (2)],
respectively.

Figure 9. Maximum contribution of initiators to the cascade size for various
$\sigma$ values. (a) ER graphs and (b) SF networks with $\gamma = 3$, for $z = 10$. Solid
lines: $\langle(\Delta S_i)_{\max}\rangle(N)$ of $O(1)$ initiators with one-by-one
addition of initiators for varying system sizes (bottom x-axis). Dashed lines:
$\langle(\Delta S_i)_{\max}\rangle(\delta p)$ for various initiator fractions (top x-axis)
for a constant system size $N = 10^5$. Dotted lines: $(\Delta S_{TL})_{\max}(\delta p)$
for various initiator fractions (top x-axis) for the TL approximation. The mean threshold
is kept at $\phi_0 = 0.50$ in all cases.

Closed-form analytical estimate

Here, we explicitly derive the closed-form expression, within the tree-like
approximation [1,2], for the fraction $S_n$ of active nodes at level $n$ given in Eq. (6)
of the main text. According to [2], for synchronous updating of the nodes, the level (or
time) dependent evolution of the fraction $q_{n+1}$ of nodes with inactive parents at
level $n + 1$ is given by

$$
q_{n+1} = g(q_n) = p + (1-p)\sum_{k=1}^{\infty}\frac{k}{z}P_k\sum_{m=1}^{k-1}\binom{k-1}{m}q_n^{m}(1-q_n)^{k-1-m}F\!\left(\frac{m}{k}\right), \tag{S1}
$$

and the fraction of active nodes at level $n + 1$ is given by

$$
S_{n+1} = h(q_n) = p + (1-p)\sum_{k=1}^{\infty}P_k\sum_{m=1}^{k}\binom{k}{m}q_n^{m}(1-q_n)^{k-m}F\!\left(\frac{m}{k}\right). \tag{S2}
$$

Substituting the cumulative probability function $F(m/k)$ for the particular case of
a uniform distribution of thresholds into the above two equations yields the closed-form
solution. Let a node $i$ have degree $k$ and an assigned threshold $\phi$. The
vulnerability $l$ is the absolute number of active neighbors required for node $i$ to
become activated, and it is given by $l = \lceil \phi k \rceil$. The cumulative
probability distribution $F(m/k)$ of nodes with degree $k$ having vulnerability less than
or equal to $m$ is given by $F(m/k) = \sum_{l=1}^{m} r_{(l,k)}$, where $r_{(l,k)}$ is the
probability that a node has vulnerability $l$, conditioned on it having degree $k$. For a
uniform threshold distribution this probability is $r_{(l,k)} = 1/k$. For example, a node
with degree $k = 2$ will have vulnerability $l = 1$ with probability $r_{(1,2)} = 1/2$ and
vulnerability $l = 2$ with probability $r_{(2,2)} = 1/2$. Thus, the fraction $F(m/k)$ of
nodes that have vulnerability $m$ or less, conditioned on having degree $k$, for the
uniform random threshold distribution is given by

$$
F\!\left(\frac{m}{k}\right) = \sum_{l=1}^{m}\frac{1}{k} = \frac{m}{k}. \tag{S3}
$$
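The claim that a uniform threshold makes every vulnerability value equally likely can be checked empirically. The following is a minimal sketch, assuming thresholds drawn uniformly from (0, 1]; the function name `vulnerability_frequencies` and the sample size are illustrative choices, not part of the original analysis.

```python
import math
import random


def vulnerability_frequencies(k, samples=200_000, seed=0):
    """Empirical distribution of the vulnerability l = ceil(phi * k)
    for a node of degree k with threshold phi uniform on (0, 1]."""
    rng = random.Random(seed)
    counts = [0] * (k + 1)
    for _ in range(samples):
        phi = 1.0 - rng.random()      # rng.random() is in [0, 1), so phi is in (0, 1]
        l = math.ceil(phi * k)        # vulnerability, an integer in 1..k
        counts[l] += 1
    # Frequencies for l = 1, ..., k (index 0 is never populated).
    return [c / samples for c in counts[1:]]


if __name__ == "__main__":
    k = 5
    freqs = vulnerability_frequencies(k)
    for m in range(1, k + 1):
        # Compare the empirical cumulative fraction with m / k.
        print(m, round(sum(freqs[:m]), 3), m / k)
```

Each frequency should be close to 1/k, and the cumulative fraction up to m close to m/k, up to sampling noise.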

Now, substituting Eq. (S3) into Eq. (S1), we show the linear relationship between the
fraction $q_{n+1}$ of nodes with inactive parents at level $n + 1$ and the fraction $q_n$
at the previous level $n$ of the approximated tree for networks with a uniform
distribution of thresholds (see Eq. (3) in the main text). So,

$$
q_{n+1} = p + (1-p)\sum_{k=1}^{\infty}\frac{k}{z}P_k\sum_{m=1}^{k-1}\binom{k-1}{m}q_n^{m}(1-q_n)^{k-1-m}\,\frac{m}{k}, \tag{S4}
$$

which simplifies to

$$
q_{n+1} = p + (1-p)\frac{1}{z}\sum_{k=1}^{\infty}P_k\sum_{m=1}^{k-1}\binom{k-1}{m}q_n^{m}(1-q_n)^{k-1-m}\,m. \tag{S5}
$$

However,

$$
\sum_{m=1}^{k}\binom{k}{m}q_n^{m}(1-q_n)^{k-m}\,m = \sum_{m=0}^{k}\binom{k}{m}q_n^{m}(1-q_n)^{k-m}\,m, \tag{S6}
$$

where the right-hand side of the equation is the mean of the binomial distribution, which
is given by $k q_n$ [3]; thus

$$
\sum_{m=1}^{k-1}\binom{k-1}{m}q_n^{m}(1-q_n)^{k-1-m}\,m = (k-1)\,q_n. \tag{S7}
$$
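The binomial-mean identity used in Eq. (S7) is easy to confirm numerically. The snippet below is a small sketch; the helper name `weighted_binomial_sum` is hypothetical, and `n` plays the role of $k - 1$.

```python
import math


def weighted_binomial_sum(n, q):
    """Evaluate sum_{m=1}^{n} C(n, m) q^m (1 - q)^(n - m) m directly."""
    return sum(
        math.comb(n, m) * q**m * (1.0 - q) ** (n - m) * m
        for m in range(1, n + 1)
    )


if __name__ == "__main__":
    k, q = 10, 0.3
    # Left-hand side of Eq. (S7) with n = k - 1, next to the closed form (k - 1) q.
    print(weighted_binomial_sum(k - 1, q), (k - 1) * q)
```

The two printed values should agree up to floating-point rounding for any $0 \le q \le 1$.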

Using the above equation in Eq. (S5) yields

$$
q_{n+1} = p + (1-p)\frac{1}{z}\sum_{k=1}^{\infty}P_k\,(k-1)\,q_n, \tag{S8}
$$

which can be rewritten as

$$
q_{n+1} = p + (1-p)\frac{1}{z}\left[\sum_{k=0}^{\infty}P_k\,(k-1) + P_0\right]q_n. \tag{S9}
$$

Since the average degree is given by $z = \sum_{k=0}^{\infty} k P_k$, the above equation
becomes

$$
q_{n+1} = p + (1-p)\frac{1}{z}\,(z - 1 + P_0)\,q_n, \tag{S10}
$$

which can be rewritten as

$$
q_{n+1} = p + b\,q_n, \tag{S11}
$$

with $b = (1-p)\frac{1}{z}(z - 1 + P_0)$. The solution of the above equation with initial
condition $q_0 = p$ is

$$
q_n = p\,\frac{1 - b^{n+1}}{1 - b}. \tag{S12}
$$
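As a brief check of Eq. (S12), the recursion in Eq. (S11) can be unrolled into a geometric sum. This is a worked step added for clarity, using only the initial condition $q_0 = p$ and assuming $b \neq 1$:

```latex
\begin{align*}
q_n &= p + b\,q_{n-1}
     = p + b\,p + b^{2} q_{n-2}
     = \dots
     = p\sum_{j=0}^{n-1} b^{j} + b^{n} q_0 \\
    &= p\sum_{j=0}^{n} b^{j}
     = p\,\frac{1 - b^{n+1}}{1 - b}.
\end{align*}
```

Setting $n = 0$ recovers $q_0 = p$, and $n = 1$ gives $q_1 = p(1 + b)$, in agreement with one application of Eq. (S11).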

Similarly, replacing $F(m/k)$ in Eq. (S2) by the right-hand side of Eq. (S3),
the analytic approximation yields

$$
S_{n+1} = p + (1-p)\sum_{k=1}^{\infty}P_k\sum_{m=1}^{k}\binom{k}{m}q_n^{m}(1-q_n)^{k-m}\,\frac{m}{k}. \tag{S13}
$$

Using again the property of the mean of the binomial distribution, the above equation
reduces to

$$
S_{n+1} = p + (1-p)\sum_{k=1}^{\infty}P_k\,\frac{1}{k}\,(k q_n), \tag{S14}
$$

which yields

$$
S_{n+1} = p + (1-p)\,q_n\sum_{k=1}^{\infty}P_k. \tag{S15}
$$

Thus, the closed-form solution for the cascade size at level $n + 1$ is given by

$$
S_{n+1} = p + c\,q_n, \tag{S16}
$$

with $c = (1-p)(1-P_0)$. Subtracting $S_n = p + c\,q_{n-1}$ [Eq. (S16) one level earlier]
from both sides of the above equation, we get

$$
S_{n+1} - S_n = c\,(q_n - q_{n-1}). \tag{S17}
$$

Substituting $q_n = p + b\,q_{n-1}$ from Eq. (S11) into the above equation yields

$$
S_{n+1} - S_n = c\,\bigl(p + (b-1)\,q_{n-1}\bigr). \tag{S18}
$$

Solving Eq. (S16) for $q_{n-1}$ at level $n - 1$ and substituting into the above equation
yields

$$
S_{n+1} - S_n = c\left(p + (b-1)\,\frac{S_n - p}{c}\right). \tag{S19}
$$

Expanding the above equation yields the closed-form phase-space equation, Eq. (6)
in the main text:

$$
S_{n+1} - S_n = cp + (1-b)\,p - (1-b)\,S_n. \tag{S20}
$$

Now, returning to the calculation of $S_{n+1}$ in Eq. (S16), substituting $q_n$ from the
right-hand side of Eq. (S12) yields

$$
S_{n+1} = p + cp\,\frac{b^{n+1} - 1}{b - 1}, \tag{S21}
$$

where the cascade size $S_0$ at level $n = 0$ is just the fraction of the initiators,
$S_0 = p$. On the other hand, in the equilibrium state (as $n \to \infty$) the cascade
size $S_{eq}$ is given by

$$
S_{eq} = p + cp\,\frac{1}{1 - b}, \tag{S22}
$$

since $0 < b < 1$. Interestingly, for uncorrelated networks the final cascade size does
not depend on the details of the degree distribution, but only on the average degree $z$
and, through $b$ and $c$, on the fraction $P_0$ of isolated nodes.
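The closed-form result can be compared against a direct iteration of Eqs. (S1) and (S2) with $F(m/k) = m/k$. The sketch below assumes, purely for illustration, a Poisson degree distribution with mean $z$ (as for large ER graphs) truncated at a maximum degree `kmax`; the function names and parameter values are hypothetical and not taken from the original analysis.

```python
import math


def poisson_pmf(k, z):
    """Poisson degree distribution P_k with mean z (assumed example)."""
    return math.exp(-z + k * math.log(z) - math.lgamma(k + 1))


def binom_pmf(m, n, q):
    """Probability of m successes in n Bernoulli(q) trials."""
    return math.comb(n, m) * q**m * (1.0 - q) ** (n - m)


def iterate_recursion(p, z, steps=100, kmax=60):
    """Iterate Eqs. (S1)-(S2) with F(m/k) = m/k; degree sums truncated at kmax."""
    P = [poisson_pmf(k, z) for k in range(kmax + 1)]
    q = p  # initial condition q_0 = p
    S = p  # S_0 = p
    for _ in range(steps):
        # Eq. (S2): cascade size at the next level, computed from the current q_n.
        S = p + (1 - p) * sum(
            P[k] * sum(binom_pmf(m, k, q) * m / k for m in range(1, k + 1))
            for k in range(1, kmax + 1)
        )
        # Eq. (S1): update q_n -> q_{n+1}.
        q = p + (1 - p) * sum(
            (k / z) * P[k]
            * sum(binom_pmf(m, k - 1, q) * m / k for m in range(1, k))
            for k in range(1, kmax + 1)
        )
    return S


def closed_form_s_eq(p, z, P0):
    """Equilibrium cascade size from Eq. (S22)."""
    b = (1 - p) * (z - 1 + P0) / z
    c = (1 - p) * (1 - P0)
    return p + c * p / (1 - b)


if __name__ == "__main__":
    p, z = 0.15, 10.0
    print(iterate_recursion(p, z), closed_form_s_eq(p, z, math.exp(-z)))
```

For a Poisson distribution $P_0 = e^{-z}$, so the two printed values should coincide up to the truncation and convergence error of the iteration.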